Auxiliary Loss for BERT-Based Paragraph Segmentation


Abstract

Paragraph segmentation is a text segmentation task. Iikura et al. achieved excellent results on paragraph segmentation by introducing focal loss to Bidirectional Encoder Representations from Transformers (BERT). In this study, we investigated paragraph segmentation on the Daily News and Novel datasets. Based on the approach proposed by Iikura et al., we used an auxiliary loss to train the model and improve paragraph segmentation performance. Consequently, the average F1-score obtained by the approach of Iikura et al. was 0.6704 on the Daily News dataset, whereas that of our approach was 0.6801. Our approach thus improved performance by approximately 1%. The improvement was also confirmed on the Novel dataset. Furthermore, two-tailed paired t-tests indicated that there was a statistically significant difference between the two approaches.
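The abstract does not say which auxiliary task is used or how the two losses are weighted, so the following is only a minimal sketch of the general idea, not the authors' method. It uses the standard binary focal loss, FL = -alpha_t * (1 - p_t)^gamma * log(p_t), for the main "is this a paragraph boundary?" prediction and adds a weighted auxiliary term. The names `binary_focal_loss`, `combined_loss`, and `aux_weight`, the default values of `gamma`, `alpha`, and `aux_weight`, and the choice of plain cross-entropy for the auxiliary task are all assumptions made for illustration.

```python
import math

EPS = 1e-12  # guards against log(0)


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for a single binary prediction.

    p: predicted probability of the positive class (here: "boundary")
    y: gold label, 0 or 1
    gamma, alpha: focal-loss hyperparameters (illustrative defaults)
    """
    p_t = p if y == 1 else 1.0 - p                # probability given to the gold class
    alpha_t = alpha if y == 1 else 1.0 - alpha    # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, EPS))


def binary_cross_entropy(p, y):
    """Plain cross-entropy, used here for the hypothetical auxiliary task."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(max(p_t, EPS))


def combined_loss(main_probs, main_labels, aux_probs, aux_labels, aux_weight=0.5):
    """Mean focal loss on the main task plus a weighted mean auxiliary loss."""
    main = sum(binary_focal_loss(p, y) for p, y in zip(main_probs, main_labels))
    main /= len(main_probs)
    aux = sum(binary_cross_entropy(p, y) for p, y in zip(aux_probs, aux_labels))
    aux /= len(aux_probs)
    return main + aux_weight * aux


if __name__ == "__main__":
    # Made-up probabilities and labels, for illustration only.
    loss = combined_loss([0.9, 0.2, 0.6], [1, 0, 1], [0.7, 0.4, 0.8], [1, 0, 1])
    print(f"combined loss: {loss:.4f}")
```

In a real BERT setup the probabilities would come from classification heads on top of the encoder, and both terms would be computed on tensors so that gradients flow through a shared encoder. Setting `aux_weight=0` recovers the focal-loss-only objective.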


Related resources

Multi-Paragraph Segmentation of Expository Texts

We present a method for partitioning expository texts into coherent multi-paragraph units which reflect the subtopic structure of the texts. Using Chafe's Flow Model of discourse, we observe that subtopics are often expressed by the interaction of multiple simultaneous themes. We describe two fully-implemented algorithms that use only term repetition information to determine the extents of the s...


Multi-Paragraph Segmentation of Expository Text

This paper describes TextTiling, an algorithm for partitioning expository texts into coherent multi-paragraph discourse units which reflect the subtopic structure of the texts. The algorithm uses domain-independent lexical frequency and distribution information to recognize the interactions of multiple simultaneous themes. Two fully-implemented versions of the algorithm are described and shown t...


Genre-Based Paragraph Classification for Sentiment Analysis

We present a taxonomy and classification system for distinguishing between different types of paragraphs in movie reviews: formal vs. functional paragraphs and, within the latter, between description and comment. The classification is used for sentiment extraction, achieving improvement over a baseline without paragraph classification.


Learning hashing with affinity-based loss functions using auxiliary coordinates

In binary hashing, one wants to learn a function that maps a high-dimensional feature vector to a vector of binary codes, for application to fast image retrieval. This typically results in a difficult optimization problem, nonconvex and nonsmooth, because of the discrete variables involved. Much work has simply relaxed the problem during training, solving a continuous optimization, and truncati...


Optimal Multi-Paragraph Text Segmentation by Dynamic Programming

There exist several methods of calculating a similarity curve, or a sequence of similarity values, representing the lexical cohesion of successive text constituents, e.g., paragraphs. Methods for deciding the locations of fragment boundaries are, however, scarce. We propose a fragmentation method based on dynamic programming. The method is theoretically sound and guaranteed to provide an optima...



Journal

Journal title: IEICE Transactions on Information and Systems

Year: 2023

ISSN: 0916-8532, 1745-1361

DOI: https://doi.org/10.1587/transinf.2022edp7083